KNN-CF Approach: Incorporating Certainty Factor to kNN Classification
نویسنده
چکیده
Abstract—KNN classification finds k nearest neighbors of a query in training data and then predicts the class of the query as the most frequent one occurring in the neighbors. This is a typical method based on the majority rule. Although majority-rule based methods have widely and successfully been used in real applications, they can be unsuitable to the learning setting of skewed class distribution. This paper incorporates certainty factor (CF) measure to kNN classification, called kNN-CF classification, so as to deal with the above issue. This CF-measure based strategy can be applied on the top of a kNN classification algorithm (or a hot-deck method) to meet the need of imbalanced learning. This leads to that an existing kNN classification algorithm can easily be extended to the setting of skewed class distribution. Some experiments are conducted for evaluating the efficiency, and demonstrate that the kNN-CF classification outperforms standard kNN classification in accuracy.
منابع مشابه
A Novel Scheme for Improving Accuracy of KNN Classification Algorithm Based on the New Weighting Technique and Stepwise Feature Selection
K nearest neighbor algorithm is one of the most frequently used techniques in data mining for its integrity and performance. Though the KNN algorithm is highly effective in many cases, it has some essential deficiencies, which affects the classification accuracy of the algorithm. First, the effectiveness of the algorithm is affected by redundant and irrelevant features. Furthermore, this algori...
متن کاملImproving K-nearest-neighborhood based Collaborative Filtering via Similarity Support
Collaborative Filtering (CF) is the most popular choice when implementing personalized recommender systems. A classical approach to CF is based on K-nearest-neighborhood (KNN) model, where the precondition for making recommendations is the KNN construction for involved entities. However, when building KNN sets, there exits the dilemma to decide the value of K --a small value will lead to poor r...
متن کاملEvolutionary fuzzy k-nearest neighbors algorithm using interval-valued fuzzy sets
One of the most known and effective methods in supervised classification is the K-Nearest Neighbors classifier. Several approaches have been proposed to enhance its precision, with the Fuzzy K-Nearest Neighbors (Fuzzy-kNN) classifier being among the most successful ones. However, despite its good behavior, Fuzzy-kNN lacks of a method for properly defining several mechanisms regarding the repres...
متن کاملPARALLEL kNN ON GPU ARCHITECTURE USING OpenCL
In data mining applications, one of the useful algorithms for classification is the kNN algorithm. The kNN search has a wide usage in many research and industrial domains like 3-dimensional object rendering, content-based image retrieval, statistics, biology (gene classification), etc. In spite of some improvements in the last decades, the computation time required by the kNN search remains the...
متن کاملBreast Cancer Diagnosis from Perspective of Class Imbalance
Introduction: Breast cancer is the second cause of mortality among women. Early detection is the only rescue to reduce the risk of breast cancer mortality. Traditional methods cannot effectively diagnose tumor since they are based on the assumption of well-balanced dataset.. However, a hybrid method can help to alleviate the two-class imbalance problem existing in the ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- IEEE Intelligent Informatics Bulletin
دوره 11 شماره
صفحات -
تاریخ انتشار 2010